Practical Identifiability of Finite Mixtures of Multivariate Bernoulli Distributions
The class of finite mixtures of multivariate Bernoulli distributions is known to be nonidentifiable; that is, different values of the mixture parameters can correspond to exactly the same probability distribution. In principle, this would mean that sample estimates using this model would give rise to different interpretations. We give empirical support to the fact that estimation of this class of mixtures can still produce meaningful results in practice, thus lessening the importance of the identifiability problem. We also show that the expectation-maximization algorithm is guaranteed to converge to a proper maximum likelihood estimate, owing to a property of the log-likelihood surface. Experiments with synthetic data sets show that an original generating distribution can be estimated from a sample. Experiments with an electropalatography data set show important structure in the data.
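The estimation procedure the abstract refers to can be illustrated with a minimal EM loop for a mixture of multivariate Bernoullis. This is a generic textbook sketch, not code from the paper; all variable names and the synthetic-data setup are illustrative.

```python
import numpy as np

def em_bernoulli_mixture(X, K, n_iter=100, seed=0):
    """EM for a mixture of K multivariate Bernoulli distributions.

    X: (N, D) binary data matrix.
    Returns mixing weights pi (K,) and Bernoulli prototypes P (K, D).
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(K, 1.0 / K)
    P = rng.uniform(0.25, 0.75, size=(K, D))  # init away from 0/1 extremes
    eps = 1e-10
    for _ in range(n_iter):
        # E-step: per-component log-likelihood, then responsibilities
        logp = X @ np.log(P + eps).T + (1 - X) @ np.log(1 - P + eps).T
        logp += np.log(pi + eps)
        logp -= logp.max(axis=1, keepdims=True)  # stabilise the exponent
        R = np.exp(logp)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and prototypes from responsibilities
        Nk = R.sum(axis=0)
        pi = Nk / N
        P = (R.T @ X) / (Nk[:, None] + eps)
    return pi, P
```

Despite the nonidentifiability of the class, running such a loop on binary samples typically recovers a distribution close to the generating one, which is the empirical point the paper makes.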
A latent-variable modelling approach to the acoustic-to-articulatory mapping problem
We present a latent variable approach to the acoustic-to-articulatory mapping problem, where different vocal tract configurations can give rise to the same acoustics. In latent variable modelling, the combined acoustic and articulatory data are assumed to have been generated by an underlying low-dimensional process. A parametric probabilistic model is estimated and mappings are derived from the respective conditional distributions. This has the advantage over other methods, such as articulatory codebooks or neural networks, of directly addressing the nonuniqueness problem. We demonstrate our approach with electropalatographic and acoustic data from the ACCOR database.
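The idea of deriving a mapping from a conditional distribution can be sketched for the simplest case, a jointly Gaussian model (e.g. one fitted by factor analysis, where the joint covariance is C = W Wᵀ + Ψ). The function and index names below are illustrative, not from the paper, and the sketch assumes zero-mean data.

```python
import numpy as np

def conditional_mapping(C, idx_a, idx_y):
    """Given the joint covariance C of stacked [acoustic, articulatory]
    variables, return the conditional-mean map x -> E[y | x] for
    zero-mean jointly Gaussian data (a sketch of the general idea)."""
    Caa = C[np.ix_(idx_a, idx_a)]          # acoustic block
    Cya = C[np.ix_(idx_y, idx_a)]          # cross-covariance block
    A = Cya @ np.linalg.inv(Caa)           # regression matrix of E[y | x]
    return lambda x: A @ x
```

With a nonlinear latent variable model the conditional distribution can be multimodal, which is how this framework represents the nonuniqueness of the inverse mapping rather than averaging it away.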
Experimental Evaluation of Latent Variable Models for Dimensionality Reduction
We use electropalatographic (EPG) data as a test bed for dimensionality reduction methods based on latent variable modelling, in which an underlying lower-dimensional representation is inferred directly from the data. Several models (and mixtures of them) are investigated, including factor analysis and the generative topographic mapping. Experiments indicate that nonlinear latent variable modelling reveals a low-dimensional structure in the data inaccessible to the investigated linear model.
Structured Multi-Hashing for Model Compression
Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this limitation by reducing the memory footprint, latency, or energy consumption of a model with minimal impact on accuracy. We focus on the task of reducing the number of learnable variables in the model. In this work we combine ideas from weight hashing and dimensionality reduction, resulting in a simple and powerful structured multi-hashing method based on matrix products that allows direct control of the model size of any deep network and is trained end-to-end. We demonstrate the strength of our approach by compressing models from the ResNet, EfficientNet, and MobileNet architecture families. Our method allows us to drastically decrease the number of variables while maintaining high accuracy. For instance, by applying our approach to EfficientNet-B4 (16M parameters) we reduce it to the size of B0 (5M parameters), while gaining over 3% in accuracy over the B0 baseline. On the commonly used benchmark CIFAR10 we reduce the ResNet32 model by 75% with no loss in quality, and are able to do a 10x compression while still achieving above 90% accuracy. Comment: Elad and Yair contributed equally to the paper; they jointly proposed the idea of structured multi-hashing. Elad wrote most of the code and ran most of the experiments; Yair was the main contributor to the manuscript; Hao and Yerlan contributed coding and experiments; Miguel advised Yerlan on optimization and model compression; Mark and Andrew contributed experiments.
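The core idea of weight hashing, which the paper builds on, is to store a small shared parameter pool and index into it with a fixed hash when materialising a large weight matrix. The sketch below shows plain HashedNet-style hashing; the paper's structured multi-hashing replaces the random hash with structured matrix products, which this illustrative snippet does not reproduce.

```python
import numpy as np

def hashed_weight_matrix(pool, shape, seed=0):
    """Materialise a weight matrix of the given shape whose entries are
    shared entries of a small trainable pool, selected by a fixed hash
    (here a seeded RNG standing in for the hash function)."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, pool.size, size=shape)  # fixed, non-trainable indices
    return pool[idx]
```

Gradients with respect to the large matrix sum back into the shared pool entries, so the number of learnable variables is the pool size, not the matrix size.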
On Contrastive Divergence Learning
Maximum-likelihood (ML) learning of Markov random fields is challenging because it requires estimates of averages that have an exponential number of terms. Markov chain Monte Carlo methods typically take a long time to converge on unbiased estimates, but Hinton (2002) showed that if the Markov chain is only run for a few steps, the learning can still work well and it approximately minimizes a different function called "contrastive divergence" (CD). CD learning has been successfully applied to various types of random fields. Here, we study the properties of CD learning and show that it provides biased estimates in general, but that the bias is typically very small. Fast CD learning can therefore be used to get close to an ML solution and slow ML learning can then be used to fine-tune the CD solution.
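Running the chain "for a few steps" is concrete in the common CD-1 case for a restricted Boltzmann machine: one Gibbs step from the data, then a gradient built from the difference of data and reconstruction statistics. This is a standard textbook sketch (biases omitted for brevity), not code from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, v0, rng, lr=0.1):
    """One CD-1 update of an RBM weight matrix W (visible x hidden).

    v0: (N, D) binary data batch; rng: numpy Generator for sampling.
    """
    # Positive phase: hidden probabilities and samples given the data
    ph0 = sigmoid(v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # One Gibbs step: reconstruct visibles, then hidden probabilities again
    pv1 = sigmoid(h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    # CD-1 gradient: data statistics minus reconstruction statistics
    grad = (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]
    return W + lr * grad
```

The paper's point is that this gradient is biased relative to the true ML gradient because the chain is truncated, but that the bias is usually small in practice.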
Are Visual Cortex Maps Optimised for Coverage?
The elegant regularity of maps of variables such as ocular dominance, orientation and spatial frequency in primary visual cortex has prompted many people to suggest their structure could be explained by an optimisation principle. Up to now, the standard way to test this hypothesis has been to generate artificial maps by optimising a hypothesised objective function, and then to compare these artificial maps with real maps using a variety of quantitative criteria. If the artificial maps are similar to the real maps, this provides some evidence that the real cortex may be optimising a similar function to the one hypothesised. However, recently a more direct method has been proposed for testing whether real maps represent local optima of an objective function (Swindale et al., 2000). In this approach, the value of the hypothesised function is calculated for a real map, and then the real map is perturbed in certain ways and the function recalculated. If each of these perturbations leads to a worsening of the function, it is tempting to conclude that the real map is quite likely to represent a local optimum of that function. In the current paper we argue that such perturbation results provide only weak evidence in favour of the optimisation hypothesis.
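The perturbation procedure being critiqued is simple to state in code: evaluate the hypothesised objective on the real map, apply each perturbation, and count how many perturbations worsen the value. The sketch below is a generic illustration of that protocol (assuming a cost to be minimised, so "worse" means larger); the function names are ours, not Swindale et al.'s.

```python
import numpy as np

def perturbation_test(objective, map_, perturbations):
    """Local-optimum check: count how many perturbations of the map
    worsen (increase) the hypothesised objective function."""
    base = objective(map_)
    worse = sum(objective(p(map_)) > base for p in perturbations)
    return worse, len(perturbations)
```

The paper's argument is that even a perfect score on such a test, with every perturbation worsening the objective, is only weak evidence that the map is a local optimum of that objective.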
Predicting Tongue Shapes from a Few Landmark Locations
We present a method for predicting the midsagittal tongue contour from the locations of a few landmarks (metal pellets) on the tongue surface, as used in articulatory databases such as MOCHA and the Wisconsin XRDB. Our method learns a mapping using ground-truth tongue contours derived from ultrasound data and drastically improves over spline interpolation. We also determine the optimal locations of the landmarks, and the number of landmarks required to achieve a desired prediction error: 3-4 landmarks are enough to achieve 0.3-0.2 mm error per point on the tongue.
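The contrast with spline interpolation is that the mapping from landmarks to the full contour is learned from training pairs. The simplest learned mapping of this kind is a linear least-squares regression from landmark coordinates to contour coordinates, sketched below with hypothetical training matrices; the paper's actual predictor may differ.

```python
import numpy as np

def fit_contour_predictor(L, C):
    """Fit a linear map from landmark coordinates L (N, d) to full
    contour coordinates C (N, D) by least squares; returns a function
    predicting a contour from one landmark vector."""
    A, *_ = np.linalg.lstsq(L, C, rcond=None)  # solve L @ A ~ C
    return lambda l: l @ A
```

Unlike a spline through the landmarks alone, such a learned map exploits contour shapes seen in training data, which is why it can do well with only 3-4 landmarks.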